nuisance-varying family
Meta Learning not to Learn: Robustly Informing Meta-Learning under Nuisance-Varying Families
In settings where both spurious and causal predictors are available, standard neural networks trained by empirical risk minimization (ERM) with no additional inductive biases tend to depend on the spurious features. As a result, additional inductive biases must be integrated to guide the network toward generalizable hypotheses. Often these spurious features are shared across related tasks, such as estimating disease prognoses from image scans coming from different hospitals, which makes generalization more difficult. In these settings, it is important that methods integrate the proper inductive biases to generalize across both nuisance-varying families and task families. Motivated by this setting, we present RIME (Robustly Informed Meta lEarning), a new method for meta-learning in the presence of both positive and negative inductive biases (what to learn and what not to learn). We first develop a theoretical causal framework showing why existing approaches to knowledge integration can lead to worse performance on distributionally robust objectives. We then show that RIME is able to integrate both biases simultaneously, reaching state-of-the-art performance on distributionally robust objectives in informed meta-learning settings under nuisance-varying families.
Predictive Modeling in the Presence of Nuisance-Induced Spurious Correlations
Puli, Aahlad, Zhang, Lily H., Oermann, Eric K., Ranganath, Rajesh
Deep predictive models often make use of spurious correlations between the label and the covariates that differ between training and test distributions. In many classification tasks, spurious correlations are induced by a changing relationship between the label and some nuisance variables correlated with the covariates. For example, in classifying animals in natural images, the background, which is the nuisance, can predict the type of animal. This nuisance-label relationship does not always hold. We formalize a family of distributions that differ only in the nuisance-label relationship and introduce a distribution in which this relationship is broken, called the nuisance-randomized distribution. We introduce a set of predictive models built from the nuisance-randomized distribution whose representations, when conditioned on, leave the label and the nuisance uncorrelated. For models in this set, we lower bound the performance on any member of the family by the mutual information between the representation and the label under the nuisance-randomized distribution. To build predictive models that maximize this performance lower bound, we develop Nuisance-Randomized Distillation (NURD). We evaluate NURD on a synthetic example, colored-MNIST, and chest X-ray classification. When using non-lung patches as the nuisance in classifying chest X-rays, NURD produces models that predict pneumonia under strong spurious correlations.
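The idea of a distribution in which the nuisance-label relationship is broken can be illustrated with a small sketch. It is not the authors' implementation. It assumes a discrete label and a discrete nuisance, and it uses importance weights p(y) / p(y | z) estimated from counts, which is one common way to make the label and the nuisance independent in a reweighted sample. The function names `nuisance_randomizing_weights` and `weighted_joint`, and the toy data with 90% label-nuisance agreement, are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def nuisance_randomizing_weights(labels, nuisances):
    """Return one weight p(y) / p(y | z) per example, estimated from counts.

    Assumes discrete labels and nuisances (an assumption of this sketch).
    """
    n = len(labels)
    joint_counts = Counter(zip(labels, nuisances))
    label_counts = Counter(labels)
    nuisance_counts = Counter(nuisances)
    weights = []
    for y, z in zip(labels, nuisances):
        p_y = label_counts[y] / n
        p_y_given_z = joint_counts[(y, z)] / nuisance_counts[z]
        weights.append(p_y / p_y_given_z)
    return weights


def weighted_joint(labels, nuisances, weights):
    """Normalized weighted joint distribution over (label, nuisance) pairs."""
    total = sum(weights)
    joint = Counter()
    for y, z, w in zip(labels, nuisances, weights):
        joint[(y, z)] += w
    return {pair: mass / total for pair, mass in joint.items()}


# Toy data: the nuisance agrees with the label 90% of the time,
# so the two are strongly correlated in the raw sample.
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(20000)]
nuisances = [y if rng.random() < 0.9 else 1 - y for y in labels]

weights = nuisance_randomizing_weights(labels, nuisances)
joint = weighted_joint(labels, nuisances, weights)

# Under the reweighted sample the joint should factorize into its marginals.
p_label = {y: sum(m for (a, _), m in joint.items() if a == y) for y in (0, 1)}
p_nuis = {z: sum(m for (_, b), m in joint.items() if b == z) for z in (0, 1)}
max_gap = max(
    abs(joint[(y, z)] - p_label[y] * p_nuis[z]) for y in (0, 1) for z in (0, 1)
)
print("largest deviation from independence:", max_gap)
```

In the raw sample the nuisance predicts the label; after reweighting, the weighted joint equals the product of its marginals up to floating-point error, so a model trained on the weighted sample cannot gain from the label-nuisance correlation alone. With high-dimensional nuisances such as image patches, the conditional p(y | z) would have to be estimated by a model rather than by counting.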